


The Open University 


Faculty of Science, Technology, Engineering and Mathematics 
M140 Introducing statistics 


M140 


TMA 01 2019B 


Covers Units 1 and 2 
Cut-off date 20 March 2019 


Please read the Assessment Guide on the module website before beginning 
work on this TMA. You can submit your TMA either by post or 
electronically using the University’s online TMA/EMA service. 


This TMA is marked out of 50. Your overall score for this TMA will be the 
sum of your marks for each question. Note, however, that because the 
University requires TMAs to be marked out of 100, the mark returned to you 
by the University will actually be out of 100 (i.e. twice the total of marks on 
your TMA script). 


The marks allocated to each part of each question are indicated in brackets 
in the margin. 


Guidance about how to answer TMA questions is given in Subsection 7.2 of 
Unit 1. 


Note that the Minitab files that you require for this assignment should be 
downloaded from the ‘Assessment’ area on the module website. 


Copyright © 2018 The Open University WEB 06683 1 




TMA 01 Cut-off date 20 March 2019 


You should be able to answer Questions 1 to 3 after you have studied Unit 1. 
You will need to use Minitab to answer Questions 2 and 3. 


You should be able to answer Questions 4 to 6 after you have studied Unit 2. 


Question 1 (Unit 1) – 8 marks 


The following is an extract from the summary of an academic article 
describing a partly statistical study in education. 


This paper investigates the effects of class size on classroom 
interactions and pupil behaviour. It extends research by 
comparing effects on pupil classroom engagement and teacher 
pupil interaction, and examining if effects vary between primary 
and secondary schools. Systematic observations were carried out 
on 686 pupils in 49 schools. Multilevel regression methods were 
used to examine relationships between class size and observation 
measures. At primary and secondary levels, smaller classes led to 
pupils receiving more individual attention from teachers, and 
having more active interactions with them. Classroom 
engagement decreased in larger classes, but, contrary to 
expectation, this was particularly marked for lower attaining 
pupils at secondary level. Low attaining pupils can therefore 
benefit from smaller classes at secondary level in terms of more 
individual attention and facilitating engagement in learning. 


In parts (a)–(d), describe how the relevant sections of the above summary 
match up with each of the four stages of the statistical modelling diagram. 
In doing so, point out places where little or no detail is given as to how one 
or more of these stages were carried out. 


(a) Pose question: what questions were being addressed? 


(b) Collect data: what data were collected in order to address the questions 
posed? 


(c) Analyse data: what analysis of the data was done? 


(d) Interpret results: how were the results of the analysis interpreted? 


Question 2 (Unit 1) – 9 marks 


The Minitab file that you require for this question should be downloaded from 
the ‘Assessment’ area on the module website. 


The data used in this question are given in the Minitab worksheet 
class-size-14.mtw, which contains the average class size in 32 countries for 
2014 (Source: www.oecd.org/edu/education-at-a-glance). In the worksheet, 
the variable second14 contains the average class size in lower secondary 
education, and the variable primary14 contains the average class size in 
primary education, for each country in 2014. Run Minitab and open this 
worksheet. 


(a) Use Minitab to produce a scatterplot of class size in lower secondary 
education in 2014 (second14) on the vertical axis against class size in 
primary education in 2014 (primary14) on the horizontal axis. Include 
a copy of your scatterplot in your answer. Describe briefly the main 
features of the plot. 


(b) Educationalists are interested in the percentage change in class size 
from primary to lower secondary. Percentage change in class size is 
calculated as 


\[
\frac{\text{class size in lower secondary education} - \text{class size in primary education}}{\text{class size in primary education}} \times 100.
\]


Use Minitab to create a new variable perchange14 that contains the 
percentage change in class size from primary education in 2014 to lower 
secondary education in 2014. Copy this column of values into your 
answer. 


(c) The data in perchange14 are reported to four decimal places. Do you 
think that this is an instance of spurious accuracy? Justify your 
answer. 


(d) Use Minitab to produce a scatterplot of percentage change 
(perchange14) on the vertical axis against class size in primary 
education in 2014 (primary14) on the horizontal axis. Describe briefly 
what the plot tells you. 
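
The percentage change calculation in part (b) can be illustrated outside Minitab. The following is a minimal Python sketch of the same arithmetic, not a substitute for the Minitab work the question asks for. The function name and the class sizes are hypothetical and are not taken from class-size-14.mtw.

```python
def percentage_change(primary, secondary):
    """Percentage change in class size from primary to lower secondary."""
    return (secondary - primary) / primary * 100


# Hypothetical average class sizes for three countries (illustration only).
primary_sizes = [20.0, 25.0, 16.0]
secondary_sizes = [22.0, 24.0, 20.0]

changes = [percentage_change(p, s)
           for p, s in zip(primary_sizes, secondary_sizes)]

# Round to one decimal place before reporting.
rounded = [round(c, 1) for c in changes]
print(rounded)
```

A positive value means that lower secondary classes are larger than primary classes in that country; a negative value means that they are smaller.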


Question 3 (Unit 1) – 7 marks 


The Minitab file that you require for this question should be downloaded from 
the ‘Assessment’ area on the module website. 


In Question 2 you considered some data about class sizes. These data are 
considered again in this question. 


Run Minitab and open class-size-14.mtw. 


(a) Produce the default stemplot of class size in primary education in 2014 
(primary14) using Minitab (i.e. the stemplot produced when the 
Increment field is left blank and the Trim outliers option is not 
selected). Include a copy of your plot in your answer. 


(b) Briefly describe the shape of the stemplot that you produced in part (a). 
Your answer should consider whether the distribution is unimodal, 
bimodal or multimodal, whether it is symmetric, right-skew or left-skew, 
and whether or not there are outliers. Justify your conclusions. 
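
To show how a stemplot is built, here is a rough Python sketch that uses the tens digit as the stem and the units digit as the leaf. It assumes non-negative whole-number values and does not reproduce the extra columns or the default increment of Minitab's output. The function name and the data are hypothetical.

```python
from collections import defaultdict


def stemplot(values):
    """Return stemplot lines: stems are tens, leaves are units."""
    leaves_by_stem = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(int(v), 10)
        leaves_by_stem[stem].append(leaf)
    lines = []
    # Include empty stems so that gaps in the data stay visible.
    for stem in range(min(leaves_by_stem), max(leaves_by_stem) + 1):
        leaves = " ".join(str(leaf) for leaf in leaves_by_stem.get(stem, []))
        lines.append(f"{stem} | {leaves}".rstrip())
    return lines


# Hypothetical class sizes (illustration only).
for line in stemplot([15, 18, 21, 21, 23, 24, 26, 27, 48]):
    print(line)
```

In this made-up batch the empty stem between 27 and 48 shows that 48 lies well away from the rest of the values, which is the kind of feature to look for when judging outliers.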


Question 4 (Unit 2) – 12 marks 
Do not use Minitab to answer this question. 


The class size data used in this question are the average class sizes in lower 
secondary education for 2005 from 32 countries (Source: 
www.oecd.org/edu/education-at-a-glance). The data are similar to those 
used in Questions 2 and 3. In those questions you considered data relating to 
2014 for the same 32 countries. 


The following stemplot has been obtained for the average class size in lower 
secondary education for 2005 from 32 countries. 


(a) Use the stemplot to prepare a five-figure summary for the 2005 data. 
Show all your working. [5] 


(b) Using your answer from part (a), calculate the interquartile range for 
the 2005 data. [1] 


(c) Explain why, when measuring spread, it is better to use the 
interquartile range rather than the range for the 2005 data. [2] 








(d) For the 2014 data, the five-figure summary for the average class size in 
lower secondary education is as follows. 


                  22 
n = 32        19      27 
          14              48 


Calculate the interquartile range for the 2014 data. [1] 


(e) Compare the medians and the interquartile ranges for the two datasets. 


What does the comparison of medians tell you about class sizes in the 
two years? [3] 
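
The five-figure summary and interquartile range can be sketched in Python as below. This sketch assumes that the quartiles and median sit at positions (n + 1)/4, (n + 1)/2 and 3(n + 1)/4 in the sorted batch, with linear interpolation between neighbouring values; that is a common textbook convention, and the rule given in Unit 2 takes precedence if it differs. The function names and the batch are hypothetical.

```python
def value_at_position(sorted_values, position):
    """Value at a 1-based, possibly fractional, position (linear interpolation)."""
    lower_index = int(position)
    fraction = position - lower_index
    lower = sorted_values[lower_index - 1]
    if fraction == 0:
        return lower
    upper = sorted_values[lower_index]
    return lower + fraction * (upper - lower)


def five_figure_summary(values):
    """Five-figure summary of a batch with at least three values."""
    data = sorted(values)
    n = len(data)
    return {
        "n": n,
        "min": data[0],
        "Q1": value_at_position(data, (n + 1) / 4),
        "median": value_at_position(data, (n + 1) / 2),
        "Q3": value_at_position(data, 3 * (n + 1) / 4),
        "max": data[-1],
    }


# Hypothetical batch with one unusually large value (illustration only).
summary = five_figure_summary([10, 12, 14, 16, 18, 20, 22, 40])
iqr = summary["Q3"] - summary["Q1"]
data_range = summary["max"] - summary["min"]
print(summary, iqr, data_range)
```

In this made-up batch the single large value stretches the range but leaves the interquartile range unchanged, which illustrates why the interquartile range is described as a resistant measure of spread.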


Question 5 (Unit 2) – 6 marks 


The Open University has developed some statistical models that can 
generate information on how likely registered students are to complete all the 
assignments for a given module. This information is usually summarised as a 
number somewhere on a scale from 0 to 1 — with 0 meaning very unlikely to 
complete, and 1 meaning very likely to complete. Figure 1 is a boxplot 
showing this information for all the students registered for M140 in a 
previous year. 


[Boxplot of Likelihood of completing assignments of M140; axis: Likelihood of completing assignments of M140] 


Figure 1 


(a) Comment on what the boxplot tells you about the shape of the batch of 
data. In your answer you should consider the symmetry or otherwise of 
the batch of data, as well as any potential outliers. Justify your answer. [4] 


(b) The statistical model that generates the likelihood of completing 
assignments on M140, for each student, uses data available at the start 
of the module that reflect each student’s motivation, opportunity, 
determination, resilience and interest in study. Where would you place 
yourself — in terms of the median, quartiles and whiskers — on the 
boxplot? Briefly justify your answer. [2] 
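
When reading a boxplot, two quick checks are often useful: comparing the two halves of the box, and flagging values that lie far beyond the box. The Python sketch below assumes the widely used rule that a potential outlier lies more than 1.5 times the interquartile range beyond a quartile; the function names, the tolerance and the numbers are hypothetical and are not read from Figure 1.

```python
def box_skew(q1, median, q3, tolerance=0.05):
    """Describe the skew suggested by the box alone (whiskers are ignored)."""
    left = median - q1
    right = q3 - median
    if abs(left - right) <= tolerance * (q3 - q1):
        return "roughly symmetric"
    return "left-skew" if left > right else "right-skew"


def potential_outliers(values, q1, q3):
    """Values more than 1.5 x IQR beyond the quartiles."""
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    return [v for v in values if v < lower_fence or v > upper_fence]


# Hypothetical likelihood values and quartiles (illustration only).
likelihoods = [0.05, 0.55, 0.6, 0.7, 0.75, 0.8, 0.85, 0.9]
print(box_skew(0.5, 0.75, 0.85))
print(potential_outliers(likelihoods, q1=0.6, q3=0.8))
```

A full description of shape would also compare the lengths of the two whiskers, which this sketch leaves out.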


Question 6 (Unit 2) – 8 marks 


A new data science company wishes to track changes in the price of energy 
(namely, gas and electricity prices) that it uses for heating, lighting and 
powering electrical and other equipment, as it affects the business from year 
to year. The data for 2015 and 2016 are displayed in Table 1. 








Table 1 

                                           2015      2016 
Gas (pence per kilowatt-hour)             3.454     4.152 
Expenditure on gas (£000)                 135.2     128.9 
Electricity (pence per kilowatt-hour)    12.182    12.964 
Expenditure on electricity (£000)         358.1     389.4 





(a) Calculate the (chained) price index for 2016, taking 2015 as the base 
year. Round your answer to four significant figures. [4] 


(b) Use the value that you worked out in part (a) to interpret how changes 
in energy prices have affected this company. [2] 


(c) The (chained) price index for 2015 was 115.4, taking 2014 as the base 
year. Calculate the (chained) price index for 2016, taking 2014 as the 
base year. Round your answer to four significant figures. [2] 
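
As a rough illustration of how a chained price index is built up, the Python sketch below computes an all-commodities price ratio as a weighted mean of the individual price ratios and then chains it onto the previous year's index. It assumes that the weights are the expenditures in the earlier year, which should be checked against the method in Unit 2. The function names and all figures are hypothetical and are not taken from Table 1.

```python
def all_commodities_price_ratio(old_prices, new_prices, weights):
    """Weighted mean of price ratios; weights are assumed to be
    the expenditures in the earlier year."""
    ratios = [new / old for old, new in zip(old_prices, new_prices)]
    weighted_sum = sum(r * w for r, w in zip(ratios, weights))
    return weighted_sum / sum(weights)


def chained_index(previous_index, price_ratio):
    """Index for this year = index for the previous year x price ratio."""
    return previous_index * price_ratio


# Hypothetical prices and earlier-year expenditures for two commodities.
ratio = all_commodities_price_ratio(
    old_prices=[2.0, 10.0],
    new_prices=[2.2, 10.2],
    weights=[100.0, 300.0],
)

# With the earlier year as base, its index is 100.
index_base_previous_year = chained_index(100.0, ratio)

# Chaining onto a hypothetical earlier index of 120.0.
index_base_earlier_year = chained_index(120.0, ratio)

# For an index between 100 and 1000, one decimal place
# corresponds to four significant figures.
print(round(index_base_previous_year, 1), round(index_base_earlier_year, 1))
```

An index above 100 indicates that, on average and weighted by expenditure, prices rose relative to the base year; the amount above 100 is the percentage increase.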